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Abstract. The Laser Interferometer Space Antenna (LISA) will produce a data 
stream containing a vast number of overlapping sources: from strong signals generated 
by the coalescence of massive black hole binary systems to much weaker radiation 
form sub-stellar mass compact binaries and extreme- mass ratio inspirals. It has been 
argued that the observation of weak signals could be hampered by the presence of 
loud ones and that they first need to be removed to allow such observations. Here we 
consider a different approach in which sources are studied simultaneously within the 
framework of Bayesian inference. We investigate the simplified case in which the LISA 
data stream contains radiation from a massive black hole binary system superimposed 
over a (weaker) quasi-monochromatic waveform generated by a white dwarf binary. 
We derive the posterior probability density function of the model parameters using an 
automatic Reversible Jump Markov Chain Monte Carlo algorithm (RJMCMC). We 
show that the information about the sources and noise are retrieved at the expected 
level of accuracy without the need of removing the stronger signal. Our analysis 
suggests that this approach is worth pursuing further and should be considered for the 
actual analysis of the LISA data. 



1. Introduction 

The Laser Interferometer Space Antenna (LISA) £[] is expected to detect gravitational 
waves from several massive black hole binary (MBHB) systems in the mass range 
~ 1O 7 M — 1O 4 M during its mission lifetime (see e.g. E] and references therein). 
Such sources produce a characteristic "chirping" waveform [3] that overlaps with the 
great variety of other signals in the LISA data stream, in particular quasi-monochromatic 
radiation generated by double white dwarf binaries (DWD) jSJ Hj and the so-called 
extreme mass ratio inspirals jH]. Radiation from MBHB's will be very strong and 
generate a signal-to-noise ratio in the range ~ 10 2 — 10 4 (HI E3 E| that can easily 
dominate the detector output. It has been suggested that in order to extract full 
information on the other much weaker sources such strong signals need to first be 
removed using an "identify and subtract" analysis scheme. Some work was carried 
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out to develop algorithms ^2] to this end, although a full characterisation of their 
performance is not yet available. 

A quite different approach to LISA data analysis has recently emerged. It is 
primarily motivated by the large (and unknown) number of signals present in the data 
set and is formulated within the framework of Bayesian inference. Such a scheme aims 
to construct at once the posterior probability density functions (PDF's) of the unknown 
model parameters given the data and the priors. A very effective numerical technique 
to construct the PDF's is based on the so-called Reversible Jump Markov Chain Monte 
Carlo algorithm (RJMCMC). The fact that the actual number of signals contained in the 
data set is unknown and needs to be determined yields a "trans-dimensional" problem, 
in which the number of search parameters is dynamically adjusted as a function of the 
signals considered in the model. Applications of (RJ)MCMC algorithms to (somewhat 
simplified) LISA data analysis problems - either pure sinusoids or the monochromatic 
signals modulated by the LISA response - have produced encouraging results [T3J E3 Ell- 
in this paper we apply a MCMC algorithm to the study of overlapping 
etherogeneous sources. In particular, we concentrate on the issue of whether it 
is necessary to remove loud signals (such as those generated by MBHB's) to infer 
information about weaker sources by investigating whether one can simultaneously fit 
the data with a model containing both signals. In order to explore this issue, we 
consider a very simplified case in which LISA data contain two signals superimposed over 
the (instrumental) noise: one from a (strong) MBHB and one from a (weaker) DWD. 
Our approach is based on Bayesian inference implemented by means of an automatic 
Metropolis-Hastings RJMCMC sampler jTHj which tackles for the first time observations 
of in-spiral binaries in LISA. Not surprisingly, we find that this approach does return 
correct information on both signals; our results (although obtained by exploring a very 
limited parameter space) suggest that this approach is worth pursuing further and should 
be considered for the actual analysis of the LISA data. 

The paper is organised as follows: in Section El we introduce the simplified data 
analysis problem that we wish to tackle and the (R J) MCMC approach used for the 
analysis; in Section |3] we discuss the results of the analysis and compare them to what 
is expected theoretically using the Fisher information matrix; Section 0] contains our 
conclusions and pointers to future work. 

2. Bayesian approach to LISA observations 

In this section we discuss a simplified mock LISA data analysis problem to explore 
whether unknown signal parameters can be estimated simultaneously when strong 
sources are present by working in the framework of Bayesian inference. We also 
briefly introduce an automatic Metropolis-Hastings RJMCMC sampler ^H] that we have 
developed to tackle challenges in LISA data analysis. In our analysis we include some of 
the salient features of LISA data analysis, while keeping the problem at a simple level 
in order to ease the numerical implementation and limit the computational burden. 
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We consider the output of a single Michelson observable (in the long wavelength 
approximation) that can be synthesised from the LISA constellation, following [§]. 
The mock data stream consists of the superposition of radiation from a MBHB and 
a DWD over a background noise. The gravitational waveform from the MBHB h B (tj) 
is approximated at the leading Newtonian quadrupole order and can be cast in the 
form PUT]: 

2/3 



hrfo) = A A p (t J/ 



cosx(tj), tj<t c (1) 

JL 

where A is a constant amplitude, A p (tj) the time dependent polarisation amplitude 
of the signal at the detector, /(t/) the instantaneous frequency of the signal at the 
(discrete) time tj, ft a reference constant frequency - that we set to the LISA nominal 
low-frequency cut-off fi = 10~ 4 Hz - and x(tj) the phase at the detector. The signal 
is set to hP = for t > tj. The MBHB signal depends on the vector of unknown 
parameters 

e B = {a , m, t c , O , e N , <p N , e L , 04 , (2) 

where Ai is the chirp mass, t c the time at coalescence, 0o an arbitrary constant reference 
phase and On, (fix, Ol, 4>l are the polar angles that describe the position of the source 
in the sky N and the orientation of the orbital angular momentum L. We model the 
signal h w (tj) from the DWD binary as monochromatic in the source frame; the LISA 
Michelson output can be written as [01 EB] : 

h w (t J )=A' A' p (t 3 ) cos x'iW; (3) 

the signal © depends on the vector of unknown parameters 

8w = {A' , f , <#,, 0' N , (j>' N , 6' L , <p' L } (4) 

where A' and fo are the constant amplitude and emission frequency, respectively, x'(tj) 
the phase at the detector, O an arbitrary constant phase and 9' N , (j)' N , 9' L , <p' L describe 
the vectors N' and L' for the DWD. The noise n(tj) is modelled in the time domain as 
a Gaussian and stationary random process with zero mean and unit variance, cr 2 = 1. 
The discrete datum d(tj) recorded by LISA can therefore be expressed as 

d(tj) = h B ( tj ) + h w (t,) + n(t 3 ) (7 = 1,2,..., AO (5) 

where N is the total number of points in the data set with sampling time At and total 
observation time T = N At. 

The goal of the analysis is to estimate the value of the unknown parameter 
vector (including noise) 0^ = {9k, o~} that describes the signal. Within the Bayesian 
framework, full information is contained in the joint posterior probability density 
function p(k, Qk\{d(tj)}) of the (in principle unknown) model m k {tj\ 9 k ), identified by k 
and characterised by the parameter vector 9k, given the data {d(tj)}. Such a posterior 
PDF can be written using Bayes' theorem as 

v(k 6M(t )\) = p(k)p(Q k \k) P ({d(t,)}\k,e k ) 

E^/p(V)A|»0p({<l(ti)}|^W 
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Table 1. Summary of the source parameters used in the simulations. The noise is 
modelled in the time domain as a stationary Gaussian process with zero mean and 
unit variance. We refer the reader to the text for more details, in particular Eqs. @ 
and for the definition of the parameters. 



where p{k) and p(Qk\ k) are prior probabilities for the model labelled k and its parameters 
9k, and p({d(tj)}\k,Qk) is the likelihood of the data {d(tj)}. The evaluation of the 
posterior © is notoriously difficult; in order to compute Eq. (JUJ) we construct a single 
"across-model" Markov chain with states x = (k, 0^) to sample directly from the 
joint posterior p(k, Qk\{d(tj)}), based on the knowledge of the likelihood and priors. 
For a given model the likelihood can be written as: 

!1 N 2] 
-^EK^')- mfe ^'^] I • ( 7 ) 

We adopt uniform prior distributions within given boundaries for all the parameters 
except noise, where we adopt priors proportional 1/cr. 

In our Reversible Jump (RJ) implementation of the algorithm, the model rrik{tj; 6k) 
is allowed to vary within a set of k models. For the analysis reported in this paper, 
we set k — 1 in the implementation and vary the model by hand in order to reduce 
the computational time. The algorithm generates the (marginalised) posterior PDF's 
for each of the parameters of the chosen model and the noise, using chains with 10 8 
members which we have (empirically) tested is sufficiently long to ensure convergence 
to the target distribution. One of the main problems that we have encountered so far 
is the long turn-around time for the simulations: e.g. our code takes about 1.5 days 
to generate the posterior PDF's on the 16 parameters of the model that describe (JSJ) 
using N = 10 4 data points on Tsunami, our 200 CPU (2GHz class processors) Beowulf 
cluster. A high priority for the immediate future is therefore to improve the efficiency 
of the algorithm. 



3. Results 

We generate in the time domain T = 10 6 seconds of LISA data with sampling time 
At = 100 sec, according to the model (JSJ): the noisy data set contains an in-spiral signal 
from a MBHB just before coalescence and a quasi-monochromatic wave from a DWD. 
The source parameters are chosen as follows. We select the value of the chirp mass (set 
to M = 1.9344 x 1O 6 M ) in such a way that the (Newtonian) lifetime of the MBHB 
at frequency fa = 10~ 4 Hz coincides with the length of the observations. For numerical 
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Figure 1. The posterior probability density functions for four of the parameters that 
describe the MBHB system: amplitude Aq, chirp mass Ai, coalescence time t c and 
initial phase 4>q. The actual values of the signal parameters used in the simulation 
are given in Table ^ The three lines correspond to the MCMC results for the model 
ttib+w (solid line), the model tub (dotted line), and the marginalised PDF's obtained 
from the multi-variate Gaussian distribution that one expects theoretically in the limit 
of signal-to-noise ratio — > oo (dashed line). In the latter we set the mean of the 
distribution to the exact value of the parameter chosen to inject a signal in the mock 
data set. See the text for more details. 



reasons, we actually set the instance of coalescence 1 s after the end of the data set. 
The signal spans the frequency range 10 -4 Hz - 2.3 mHz. The frequency of the DWD is 
set to /o = 5 x Hz. The amplitudes of the signals are chosen so that the optimal 
signal-to-noise ratios correspond to pu — 250 and pr> — 45 for the MBHB and the 
DWD, respectively. The location of the sources and orientation of the orbital angular 
momenta are chosen randomly. The exact values of the source parameters are listed in 
Table d The two gravitational wave signals overlap completely in the time domain and 
(partially) in the frequency domain; note however that the overlap of the waveforms in 
parameter space (see e.g. Section 3 of [IHj for a definition) is only ~ 0.009. 

It is helpful to highlight the motivations of the simplifications that we have 
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introduced in the simulation and the implications on the results. The choice of a shorter 
observation time with respect to the standard 1 yr of observation is determined by the 
computational costs: the time it takes to evaluate the PDF's scales linearly with N. 
The noise is chosen to be white for convenience of implementation. Note that the low 
frequency portion of the radiation for the MBHB becomes more important than it would 
actually be in LISA observations, because the noise does not rise below ~ 1 mHz. This 
assumption alters the actual numerical results of our analysis, but not the behaviour of 
the algorithm. We consider one Michelson output in the low frequency approximation 
(which is appropriate because the MBHB and DWD that we consider here radiate at 
frequencies below a few mHz) for simplicity and speed: the use of the full power of TDI 
(see e.g. JH] and references therein ) will affect the results from a quantitative, but not 
qualitative point of view. 

Given the data set that we have just described, we compute/search for the 
(marginalised) posterior PDF's for each of the model parameters and the noise. We 
consider three different models assuming in each case that noise is also present (while 
the data set does not change): (i) a model consisting of a MBHB and a DWD {rriB+w), 
that corresponds to a truthful representation of the actual signals in the data set c.f. (JHJ), 
hence we search over the vector of 15+1 unknown parameters Qs+w+n = {#b, &w, °~}', 
(ii) a model that contains only a MBHB (mg) implying a search over the vector of 
8+1 unknown parameters Qb+u = {6b,&}', and (iii) a model that contains only a 
DWD {m w ) implying a search over the 7+1 parameter vector Ow+n = {@Wi°~}- Using 
model rriB+w we can explore whether we can study radiation from a DWD even when 
the stronger signal produced by the MBHB is present. Results obtained using either 
tub or mw provide us with useful complementary information about the behaviour 
of this choice of approach to the analysis. In order to quantitatively characterise the 
results, we also compute the variance-covariance matrix which allows us to evaluate the 
joint posterior PDF of the parameters in the limit p — > oo; this yields a multi-variate 
Gaussian distribution centred on the true parameter values and should be compared to 
our numerical evaluation of Eq. EI for the problem at hand. The results of our analysis 
are summarised in Figures [TJ El El and |U 

We consider first the results obtained using the (correct) model for the data, 
that is rriB+w- The first key point to note is that the unknown parameters Qs+w+n 
(which includes the noise) are correctly recovered by our analysis scheme in which we 
simultaneously search for both the strong and weak signal. This result - which is 
however limited just to one choice of source parameters - is an indication that there 
might not be the need a priori to remove very loud signals to study weaker ones. 
We also note that the marginalised posterior PDF's of each parameter are broadly 
consistent with what is predicted theoretically (which assumes p — > oo): the spread of 
the distributions is comparable to the value of the mean square errors as evaluated using 
the variance-covariance matrix, although the PDF's are not exactly Gaussian. This is 
expected in a real analysis process. As far as the individual PDF's are concerned, it 
is interesting to notice that although the observation time is just rs 12 days, one can 
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Figure 2. The posterior probability density functions for three of the DWD 
parameters - amplitude A' , frequency fo and initial phase cj>' - and the variance 
of the noise a. The actual values of the DWD parameters used in the signal injection 
are reported in Tabled The three lines in the plots for A' , fg and (f> correspond to 
the MCMC results for the model itib+w (solid line), the model mw (dotted line), and 
the marginalised PDF's obtained from the multi-variate Gaussian distribution that one 
expects theoretically in the limit of signal-to-noise ratio — > oo (dashed line). In the 
latter we set the mean of the distribution to the exact value of the parameter chosen 
to inject a signal in the mock data set. For the PDF of a the solid line corresponds to 
the model wib+w an< i the dotted lines to the case mw and vtlb- In the computation of 
the theoretical multivariate Gaussian distribution of the source parameters the noise 
level is assumed to be known. 



extract some information about the geometrical parameters that describe the MBHB. 
In general one would not expect this to be the case, because the modulations induced 
by the LISA motion are still small. The result can explained by the fact that the 
signal-to-noise ratio is large and the noise is constant (instead of increasing toward 
lower frequencies). On the other hand, no information can be gained about the source 
position and orientation of the orbital plane for the DWD (the PDF's for the parameters 
9' N , <j)' N , 8' L , <p' L are essentially flat over the entire relevant parameter space, and do not 
show any significant peak): this is easily explained by the fact that / = 5x 10~ 4 Hz 
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Figure 3. Same as for Figure^but for the angular parameters describing the position 
N and orientation L of the MBHB. The actual parameter values are given in Tabled 



and over the short observation time the modulations (dominated by the LISA change of 
orientation) are too small, given the signal-to-noise ratio, to allow the disentangling of 
the geometrical parameters. This can also be easily verified by calculating the variance- 
covariance matrix, which is essentially degenerate for the 7 parameter case of the DWD 
considered in the simulation. In fact, the theoretical results that we show in the figures 
are obtained computing the variance-covariance matrix only for the 3-dimensional case 
in which the parameters are A' , f , <p' . 

It is also instructive to consider the case in which we compute the posterior PDF's 
assuming the wrong models, m# and mw In each case, the source included in the 
model is still identified. However the PDF's are significantly biased away from the value 
corresponding to the parameters characterising the injected signals and the structure of 
the PDF's themselves are affected. The most dramatic change is however in the PDF 
of the variance of the noise, Fig. |2l which is peaked significantly above the actual value 
cr = l. This is a result of the algorithm being forced to assign the extra power from the 
un-searched for source to the noise. Such an effect is naturally stronger for m\y where 
the entire contribution of the loud MBHB is left to be accounted for by the noise. 
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Figure 4. The posterior probability density functions for the four angular parameters 
describing the position N' and orientation V of the DWD. The actual values of the 
DWD parameters used for the signal injection are reported in Table H The two 
lines in the plots correspond to the MCMC results for the model rriB+w (solid line) 
and the model raw (dotted line). Notice that here we do not show the relevant 
marginalised PDF's obtained from the multi-variate Gaussian distribution that one 
expects theoretically in the limit of signal-to-noise ratio — > oo because those parameters 
are essentially degenerate and we computed the Fisher information matrix only for the 
3-dimensional case in which the parameters are Aq, /o, <fi' . 

4. Conclusions 

We have explored the application of Bayesian inference to the analysis of LISA data 
in the presence of gravitational waves emitted by sources of different classes. More 
specifically we have investigated whether the presence of strong radiation from massive 
black hole binary systems indeed prevents, if not previously removed from the data 
set, the study of weaker sources (such as white dwarf binaries). Our approach is to 
"simultaneously fit" the source parameters, by constructing the (marginalised) posterior 
probability density functions of the unknown signal model parameters, via an automatic 
Metropolis-Hastings (RJ) Markov Chain Monte Carlo sampler (T^j. The results of this 
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preliminary study show that this approach is promising and could be suitable for LISA 
data analysis; it allows us to fully exploit the data without the need for adopting a 
recursive approach in which stronger signals are identified and progressively removed 
from the data stream (which is a very delicate process) in order to study weaker ones. 

The results presented in this paper should be regarded only as a promising initial 
step of a more detailed exploration of this important analysis problem within the 
Bayesian framework. In fact, our analysis suffers from a number of limitations - such 
as short integration time, simple waveforms, one Michelson observable, white noise - 
that are derived from the need to keep the numerical implementation at a sufficiently 
simple level and to minimise the computational burden. We are currently extending the 
exploration of this analysis scheme to more realistic situations and, in order to do so, 
improving the efficiency of our code. Nonetheless, the technique that we have presented 
here is general and can be applied to a wide range of analysis challenges provided by 
LISA. 
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